AITopics | Kingston

Simulated environments have proven invaluable in Autonomous Cyber Operations (ACO), where Reinforcement Learning (RL) agents can be trained without the computational overhead of emulation. These environments must accurately represent cybersecurity scenarios while producing the necessary signals to support RL training. In this study, we present a framework where we first extend CybORG's Cage Challenge 2 environment by implementing three new actions: Patch, Isolate, and Unisolate, to better represent the capabilities available to human operators in real-world settings. We then propose a design for agent development where we modify the reward signals and the agent's feature space to enhance training performance. To validate these modifications, we train DQN and PPO agents in the updated environment. Our study demonstrates that CybORG can be extended with additional realistic functionality, while maintaining its ability to generate informative training signals for RL agents.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2508.19278

Country: North America > Canada > Ontario > Kingston (0.29)

Genre: Research Report > New Finding (0.35)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.52)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Comparative Evaluation of Teacher-Guided Reinforcement Learning Techniques for Autonomous Cyber Operations

Tholl, Konur, Mezouar, Mariam El, Mallah, Ranwa Al

arXiv.org Artificial Intelligence, Aug-21-2025

Autonomous Cyber Operations (ACO) rely on Reinforcement Learning (RL) to train agents to make effective decisions in the cybersecurity domain. However, existing ACO applications require agents to learn from scratch, leading to slow convergence and poor early-stage performance. While teacher-guided techniques have demonstrated promise in other domains, they have not yet been applied to ACO. In this study, we implement four distinct teacher-guided techniques in the simulated CybORG environment and conduct a comparative evaluation. Our results demonstrate that teacher integration can significantly improve training efficiency in terms of early policy performance and convergence speed, highlighting its potential benefits for autonomous cybersecurity.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2508.1434

Country: North America > Canada > Ontario > Kingston (0.29)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (0.56)
Government > Military > Cyberwarfare (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

The Social Life of Industrial Arms: How Arousal and Attention Shape Human-Robot Interaction

El-Helou, Roy, Pan, Matthew K. X. J

arXiv.org Artificial Intelligence, Aug-13-2025

Ingenuity Labs Research Institute, Queen's University, Kingston, Canada. matthew.pan@queensu.ca

Abstract: This study explores how human perceptions of a non-anthropomorphic robotic manipulator can be shaped by two key dimensions of behaviour: arousal, defined as the robot's movement energy and expressiveness, and attention, defined as the robot's capacity to selectively orient toward and engage with a user. We present an integrated behaviour system that applies and extends existing movement-centric design principles to non-anthropomorphic robots. Our system combines a gaze-like attention engine with an arousal-modulated motion layer to explore how expressive and interactive behaviours influence social perception in robotic manipulators. In a user study, we find that robots exhibiting high attention, actively directing their focus toward users, are perceived as warmer and more competent, intentional, and lifelike. In contrast, high arousal, characterized by fast, expansive, and energetic motions, increases perceptions of discomfort and disturbance. Importantly, a combination of focused attention and moderate arousal yields the highest ratings of trust and sociability, while excessive arousal diminishes social engagement.

artificial intelligence, perception, robot, (12 more...)

arXiv.org Artificial Intelligence

2504.0126

Country: North America > Canada > Ontario > Kingston (0.24)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)
Questionnaire & Opinion Survey (0.88)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology: Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.89)

Handoff Design in User-Centric Cell-Free Massive MIMO Networks Using DRL

Ammar, Hussein A., Adve, Raviraj, Shahbazpanahi, Shahram, Boudreau, Gary, Bahceci, Israfil

arXiv.org Artificial Intelligence, Aug-5-2025

In the user-centric cell-free massive MIMO (UC-mMIMO) network scheme, user mobility necessitates updating the set of serving access points to maintain the user-centric clustering. Such updates are typically performed through handoff (HO) operations; however, frequent HOs lead to overheads associated with the allocation and release of resources. This paper presents a deep reinforcement learning (DRL)-based solution to predict and manage these connections for mobile users. Our solution employs the Soft Actor-Critic algorithm, with continuous action space representation, to train a deep neural network to serve as the HO policy. We present a novel proposition for a reward function that integrates a HO penalty in order to balance the attainable rate and the associated overhead related to HOs. We develop two variants of our system; the first one uses mobility direction-assisted (DA) observations that are based on the user movement pattern, while the second one uses history-assisted (HA) observations that are based on the history of the large-scale fading (LSF). Simulation results show that our DRL-based continuous action space approach is more scalable than its discrete-space counterpart, and that our derived HO policy automatically learns to gather HOs in specific time slots to minimize the overhead of initiating HOs. Our solution can also operate in real time with a response time less than 0 .

Index Terms: Mobility, handoff, handover, user-centric, cell-free massive MIMO, distributed MIMO, deep reinforcement learning, soft actor critic, machine learning, channel aging.

User-centric cell-free massive MIMO (UC-mMIMO) is a wireless network architecture where each user is served by a custom group of neighboring access points (APs) which are connected to a central unit (CU) via fronthaul links [1]. Unlike the current cellular system that is based on macro base stations, UC-mMIMO deploys cooperative APs that jointly serve users without relying on traditional cellular boundaries. UC-mMIMO helps to achieve reliable wireless connectivity and provides uniform performance throughout the network [1], [2]. However, this beyond-5G mobile wireless network architecture introduces the key challenge of determining the connections between the APs and the users when moving through the network [3].

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

2507.20966

Country: